Collagen

Collagen, the most abundant protein in vertebrates, is an extracellular, inextensible fibrous protein that is the major protein component of such stress-bearing structures as bones, tendons, and ligaments. As with all fibrous proteins, collagen is, for the most part, characterized by a highly repetitive, simple sequence. Here we study two model compounds for naturally occurring collagen (the structure of 4CLG is shown in the applet to the right) in order to develop an understanding of the fibrous portion of collagen and to show how the different levels of protein structure come together to form a highly ordered and stable fiber. Collagen's rigidity and inextensibility are due to this highly ordered structure. The part of collagen that lacks structural order is not illustrated in this model. That part of the protein complex has a different amino acid composition, in which lysine and hydroxylysine are particularly important residues; it is globular in nature and not as structurally organized. Lysine and hydroxylysine form covalent crosslinks in the protein complex, thereby adding strength and some flexibility to the fiber. This covalent crosslinking continues throughout life and produces a more rigid collagen and brittle bones in older adults. Go to Collagen Structure & Function for information on the functions and disorders of collagen, and see the External Links section of this page for assembly movies of the triple helix of types I and IV.

Structure of a Segment
A fiber segment is made up of 5 tropocollagens, each shown in a different color. One limitation of this model of a collagen segment is that, instead of having flush-cut ends as shown here, the ends of the tropocollagens in an actual fiber section would be staggered. This staggered pattern is produced when the tropocollagens associate to form the fiber segment. The collagen fiber is constructed by connecting the segments together, and the staggered ends permit tropocollagens from different segments to form strong attractions, adding to the strength of the fiber. Add tropocollagens one at a time to form the fiber section: two, three, four, five. View the fiber segment as backbone only. Viewing the segment from the end, one can see that without the side chains displayed the center of the fiber is empty. Each tropocollagen molecule contains 3 parallel peptide chains wrapped around one another to make a right-handed triple helix that is 87 Å long and ~10 Å in diameter. Tropocollagen displayed as backbone only.

Primary Structure of Peptide
Show side chains of the peptide in wireframe display. Identify the amino acids making up the peptide by resting the cursor on a residue and observing the name in the label (toggling spin off will make this easier). Which three amino acids are present in the peptide in a recurring pattern? Collagen is characterized by a distinctive repeating sequence: (Gly-X-Y)n, where X is often Pro, Y is usually 4-hydroxyproline (Hyp), and n may be >300. The model (4CLG) being studied here contains a repeating sequence of residues: Gly-Pro-Hyp. This sequence produces a left-handed helix with a pitch (rise per turn) of 10.0 Å and 3.3 residues per turn; the peptide is colored in three-residue segments. <scene name='Collagen/Peptide_helix_z_axis/1'>Looking down the center axis of a segment of the helix. Since a helix with a larger pitch is superimposed on the helix described above, the entire center axis does not align for viewing. The <scene name='Collagen/Ramachandran/2'>Ramachandran plot shows that the psi and phi angles of the collagen helix are different from those of the α-helix, which has 3.6 residues per turn. The two clusters shown here are outside of the area expected for an α-helix. Review where you would expect a cluster of α-helix residues to be located.
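The repeating pattern and the helix figures quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, the residue list is typed by hand rather than read from 4CLG, and the α-helix pitch of 5.4 Å is a standard textbook value that does not come from this page. The function assumes the peptide is written starting at a Gly.

```python
def is_gly_x_y(residues):
    """Return True if the list is whole triplets with Gly first in each."""
    return (
        len(residues) > 0
        and len(residues) % 3 == 0
        and all(res == "GLY" for res in residues[0::3])
    )


def rise_per_residue(pitch_angstrom, residues_per_turn):
    """Axial advance per residue (in Å) from pitch and residues per turn."""
    return pitch_angstrom / residues_per_turn


# Hand-typed stand-in for the model peptide (assumption: three repeats).
model = ["GLY", "PRO", "HYP"] * 3
print(is_gly_x_y(model))                      # True

# Collagen single-chain helix, using the figures quoted in the text.
print(round(rise_per_residue(10.0, 3.3), 2))  # 3.03

# α-helix for comparison (textbook pitch of 5.4 Å, 3.6 residues per turn).
print(round(rise_per_residue(5.4, 3.6), 2))   # 1.5
```

With these numbers each residue of the collagen chain advances roughly twice as far along the helix axis as a residue in an α-helix, which is one way to see how much more extended the collagen conformation is.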

Other Levels of Structure
As shown above, tropocollagen is formed by <scene name='Collagen/One_tropocollagen/1'>three peptides twisting around each other, and in doing so the peptides make <scene name='Collagen/Peptide_3_residue_segments2/2'>one turn every ~7 three-residue repeats (cyan-colored residues mark the approximate length of one turn). <scene name='Collagen/One_tropocollagen2/1'>Three cyan-colored residues mark the approximate distance of one turn of the peptides in a tropocollagen. Tropocollagen displayed as <scene name='Collagen/One_tropocollagen_backbone2/1'>backbone only clearly shows both types of helical turns: the 3.3 residues/turn and the ~21 residues/turn.
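The relationship between the two kinds of helical turn is simple arithmetic, sketched below in Python. The constant names are hypothetical, and the values are the approximate figures quoted in the text, so the results are approximate as well.

```python
RESIDUES_PER_REPEAT = 3        # one Gly-Pro-Hyp repeat
REPEATS_PER_SUPERTURN = 7      # "~7 three-residue repeats" per turn (approximate)
RESIDUES_PER_MINOR_TURN = 3.3  # left-handed helix of a single peptide

# Residues a peptide traverses in one turn around the triple-helix axis.
residues_per_superturn = RESIDUES_PER_REPEAT * REPEATS_PER_SUPERTURN

# Number of tight single-chain turns completed within one such turn.
minor_turns_per_superturn = residues_per_superturn / RESIDUES_PER_MINOR_TURN

print(residues_per_superturn)               # 21
print(round(minor_turns_per_superturn, 1))  # 6.4
```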

Looking down the axis of a tropocollagen displayed as wireframe, <font color="#ff0000">glycine can be seen <scene name='Collagen/Gly_position_tropo/2'>positioned in the center of the triple helix. The two types of helical turns consistently position the Gly in the center of the triple helix. Proline and hydroxyproline are on the <scene name='Collagen/Pros_position_tropo/1'>outside of the triple helix. Because the hydroxyl group of Hyp extends to the surface of the triple helix, it can take part in hydrogen bond formation, as will be seen below. The cyclic side chains of Pro and Hyp are somewhat rigid, and this rigidity adds to the stability of the collagen fiber. The primary structure of repeating Gly-Pro-Hyp, along with the two types of helical turns, determines the 3D positions of Gly, Pro and Hyp in the tropocollagen.

In order to make a compact, strong fiber, the interior residues of the triple helix need to be close packed. The <scene name='Collagen/Gly_no_hindrance/1'>Gly side chain is the only one small enough to accommodate this close packing in the interior of the triple helix. (Note that in this model the hydrogen on the α-carbon is not displayed.) <scene name='Collagen/Glys_close_pack/1'>Three Gly, one on each of three different chains, are close packed together. The gray atoms of the yellow and lime Gly are the α-carbons, and only a hydrogen could fit between these carbons and the atoms of the adjacent Gly. <scene name='Collagen/Glys_pro_close/2'>A Pro on each of the 3 chains is shown close packed to the three Gly (lime, cyan, yellow). Adding the <scene name='Collagen/Glys_pro_hyp/1'>Hyp shows that Pro and Hyp are tightly positioned around the small interior Gly, leaving no space for side chains larger than the single hydrogen of Gly.

Intra-tropocollagen Attractions
Intra-tropocollagen attractions are primarily hydrogen bonds formed between the peptides in the triple helix. The three polypeptide chains are <scene name='Collagen/Intra-hbonds/4'>staggered in position by one residue; that is, a Pro on Chain A is at the same level along the triple helix axis as a <font color="#ff0000">Gly on Chain B and a Hyp on Chain C. This staggered arrangement not only <scene name='Collagen/Intra-hbonds2/6'>aligns a <font color="#ff0000">Gly main chain NH (amide group) with a Pro main chain O (carbonyl oxygen) on one of the other peptides but also brings the two groups close enough to form a <scene name='Collagen/Intra-hbonds6/2'>hydrogen bond between the amide hydrogen and the carbonyl oxygen. This alignment occurs with Gly in each of the three peptides, so that the Gly hydrogens of Chain A form <scene name='Collagen/Intra-hbonds3/3'>hydrogen bonds (orange) with the Pro carbonyl oxygens on Chain B, and likewise Gly of <scene name='Collagen/Intra-hbonds4/2'>Chain B to Pro of Chain C (yellow) and Gly of <scene name='Collagen/Intra-hbonds5/1'>Chain C to Pro of Chain A (green). These hydrogen bonds, extending the length of the tropocollagen, add up to a strong attractive force that maintains the integrity of the tropocollagen. Since the main chain N atoms of both Pro and Hyp residues lack H atoms, only Gly can provide the hydrogen to form these hydrogen bonds.
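The one-residue stagger can be pictured as three copies of the same repeat shifted against one another. The Python sketch below is a toy model, not a reading of the 4CLG coordinates: the names are hypothetical, "level" is an idealized integer position along the triple-helix axis, and the direction of the shift is chosen so that it reproduces the Pro/Gly/Hyp example given in the text.

```python
REPEAT = ("Gly", "Pro", "Hyp")

# Assumed register: chain B is shifted by one residue and chain C by two,
# relative to chain A, along the triple-helix axis.
CHAIN_OFFSET = {"A": 0, "B": 1, "C": 2}

# Only Gly has a backbone N-H; the backbone N of Pro and Hyp carries no H.
HAS_BACKBONE_NH = {"Gly": True, "Pro": False, "Hyp": False}


def residue_at_level(chain, level):
    """Residue type found on `chain` at the integer axial `level`."""
    return REPEAT[(level - CHAIN_OFFSET[chain]) % len(REPEAT)]


def donors_at_level(level):
    """Chains whose residue at this level can donate a backbone N-H."""
    return [
        chain
        for chain in CHAIN_OFFSET
        if HAS_BACKBONE_NH[residue_at_level(chain, level)]
    ]


for level in range(1, 4):
    row = {chain: residue_at_level(chain, level) for chain in CHAIN_OFFSET}
    print(level, row, "donor chain:", donors_at_level(level))
```

At every level exactly one of the three chains presents a Gly, so exactly one backbone N-H is available there, which is consistent with the point that only Gly can provide the hydrogen for these bonds.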

Inter-tropocollagen Attractions
Hydrogen bonds are also an important inter-tropocollagen force, holding the tropocollagens together in the fiber segment. As shown above, Hyp is the outermost residue on the <scene name='Collagen/Pros_position_tropo/1'>surface of the triple helix, and its hydroxyl groups are the atoms that extend out the most from the surface. The hydrogen bonds are formed between the hydroxyl hydrogen of a Hyp and a backbone carbonyl oxygen. As the peptides in a tropocollagen twist about each other, they come into <scene name='Collagen/Hlite_c_k_peptides/1'>close contact with particular peptides in adjacent tropocollagens and then move away from them. The two peptides highlighted in spacefill are located in two different tropocollagens. Notice that in this case they make contact with each other in the middle of the strands, and a hydrogen bond is located at this point of contact. The <scene name='Collagen/Inter-hbonds1/2'>hydrogen bond consists of the carbonyl oxygen of a Hyp in a <font bold="" color="blue"> peptide of one tropocollagen and the hydroxyl hydrogen of a Hyp in a peptide of another tropocollagen. Another example shows <scene name='Collagen/Hlite_k_o/1'>two peptides from two different tropocollagens making contact at the ends of the fiber segment, and it is within these regions of contact that the inter-tropocollagen attractions occur. At one end a <scene name='Collagen/Inter-hbond2/4'>hydrogen bond is formed between a hydroxyl hydrogen of Hyp in one peptide and an oxygen of a Gly carbonyl in the second peptide. At the other end of the two peptides a <scene name='Collagen/Inter-hbond3/2'>Hyp carbonyl oxygen accepts a hydrogen bond from a Hyp hydroxyl hydrogen. Show the <scene name='Collagen/2nd_view_hbond3/2'>hydrogen bond in the context of the six peptides of the two tropocollagens. The above examples of hydrogen bonding illustrate that Hyp plays a central role in maintaining the structures of both the tropocollagen and the collagen fiber.
Without adequate vitamin C in their diets, humans cannot make Hyp, and therefore cannot make stable collagen and strong bones.

Effect of a Mutation
The mutation being considered is an Ala replacing a Gly. The synthetic model (PDB ID: 1CAG) is a <scene name='Collagen/1cag/7'>tropocollagen whose peptides contain thirty residues and have a <scene name='Collagen/Collagen_chain_1cag/4'>sequence of (Pro-Hyp-Gly)4-Pro-Hyp-Ala-(Pro-Hyp-Gly)5 (Ala displayed as large wireframe). Viewing 1CAG from the side of the fiber shows that the <scene name='Collagen/1cag1/1'>Gly is only partially visible because it is buried in the interior; <scene name='Collagen/1cag2/1'>Pro, being positioned closer to the surface, is much more visible; <scene name='Collagen/1cag3/1'>Hyp, being entirely on the surface, is clearly visible; and <scene name='Collagen/1cag4/1'>Ala, being a substitute for Gly, is only partially visible.
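The compact sequence notation can be expanded mechanically to confirm the peptide length and to locate the substitution. The Python sketch below uses a hypothetical helper name, and the position it reports is counted within the expanded list, which may not match the residue numbering used in the 1CAG coordinate file.

```python
def build_1cag_like_peptide():
    """Expand (Pro-Hyp-Gly)4-Pro-Hyp-Ala-(Pro-Hyp-Gly)5 into a residue list."""
    triplet = ["Pro", "Hyp", "Gly"]
    return triplet * 4 + ["Pro", "Hyp", "Ala"] + triplet * 5


peptide = build_1cag_like_peptide()
ala_position = peptide.index("Ala") + 1  # 1-based position within this list

print(len(peptide))          # 30
print(ala_position)          # 15
print(peptide.count("Gly"))  # 9
```

Nine of the ten triplets keep their Gly, so in this model the disruption is confined to a single triplet near the middle of each chain.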

The <scene name='Collagen/1cag_surface/4'>surface of the tropocollagen is shown with the Ala appearing as olive and the Pro and Hyp adjacent to the Ala appearing as dark brown. Notice that the surface at these Pro and Hyp bulges slightly. This protrusion arises because the packing about the Ala side chains is not as close as it is about the Gly. In the two side-by-side scenes shown below, compare the amount of open space between the chains in the area of the scene center. In the 1CAG scene, the distance between the chains in the area of the Ala is slightly greater than in the 4CLG scene.

</StructureSection>

To convince yourself that there is a difference in the interchain distances in the area of the Ala, <scene name='Collagen/1cag_measurements/2'>show distances between the Gly (Ala) and Pro atoms that form intra-tropocollagen hydrogen bonds. Hydrogen bonds are not formed between Ala and Pro because the distances between the atoms that would form the bonds are too great. The absence of these intra-tropocollagen hydrogen bonds, which results from replacing Gly with a residue having a longer side chain, disrupts collagen's rope-like structure and is responsible for the symptoms of such human diseases as osteogenesis imperfecta and certain Ehlers-Danlos syndromes.
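The distance argument can be expressed as a simple geometric test. The Python sketch below is an illustration under stated assumptions: the 3.5 Å limit on the donor N to acceptor O separation is a commonly used rule of thumb rather than a value from this page, the function names are hypothetical, and the coordinates are invented for the example, not taken from 1CAG or 4CLG.

```python
import math

# Assumed upper limit for a donor N to acceptor O hydrogen bond, in Å.
MAX_N_TO_O_DISTANCE = 3.5


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in Å."""
    return math.dist(a, b)


def can_hydrogen_bond(n_xyz, o_xyz, cutoff=MAX_N_TO_O_DISTANCE):
    """True if the N and O atoms are close enough to hydrogen bond."""
    return distance(n_xyz, o_xyz) <= cutoff


# Hypothetical coordinates in Å, chosen only to illustrate the test.
backbone_n = (0.0, 0.0, 0.0)
pro_o_close = (2.9, 0.0, 0.0)  # separation typical of a hydrogen bond
pro_o_far = (0.0, 4.4, 0.0)    # chains pushed apart by a bulkier side chain

print(can_hydrogen_bond(backbone_n, pro_o_close))  # True
print(can_hydrogen_bond(backbone_n, pro_o_far))    # False
```

A fuller check would also consider the angle at the hydrogen, but a distance test alone is enough to mirror the measurement shown in the scene.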

3D structures of collagen
Update June 2011

3hqv, 3hr2 – Col I – rat – fiber diffraction

1q7d - hCol I α1 integrin-binding domain – human

1u5m - hCol II α1 (mutant)

3dmw - hCol III α1

1li1 - hCol IV α Nc1 domain

1t60, 1t61, 1m3d - Col IV α Nc1 domain – bovine

1kth - hCol III α3 Kunitz type domain

1kun - hCol III α3 Kunitz type domain – NMR

2knt, 1knt - hCol VI Kunitz type domain

1o91 - mCol VIII α1 Nc1 domain - mouse

2uur - hCol IX α1 Nc4 domain

1gr3 - hCol X α1 Nc1 domain

1b9p, 1b9q - Col IX α1 Nc4 domain (mutant)

3n3f – hCol XIV Nc1 domain

1dy2 - mCol XV endostatin domain

3hon, 3hsh - hCol XVIII tetramerization domain

1bnl - hCol XVIII C terminal domain

1dy0, 1dy1 - mCol XVIII endostatin domain

2ekj, 2ee3 - hCol XX α1 fn3 domain

2dkm - hCol XX α1 fn3 domain - NMR

3ipn – Col modified

1wzb, 1itt, 1k6f – Col triple helix

1zpx, 1sp7, 1sop – Col mini – hydra – NMR

2cuo, 2d3f, 2d3h, 2g66 – Col model peptides

Collagen complex with binding proteins
3ejh – hCol I α1 C-terminal + fibronectin

2fse – hCol II + MHC HLA-DR1

2seb - hCol II + MHC HLA-DR4

2v53 - hCol III α1 + Sparc

2wuh – hCol + discoidin domain receptor 2

1dzi – Col + integrin α2 domain

2f6a – Col + Col adhesin

Contributor
Much of the content of this page was taken from an earlier non-Proteopedia version of Collagen which was in large part developed by Gretchen Heide Bisbort, a 1999 graduate of Messiah College.